Speech Overlap and Interplay with Disfluencies in Political Interviews

Authors

  • Gilles Adda
  • Martine Adda-Decker
  • Claude Barras
  • Philippe Boula de Mareüil
  • Benoît Habert
  • Patrick Paroubek
Abstract

The reported study focuses on overlapping speech, transcription, annotation and disfluency analysis in an 8-hour audio corpus of French political interviews. Overlaps are frequent (on average 3-4 overlaps per minute) and of short duration (5% of the data), with non-intrusive overlaps being significantly shorter than intrusive ones. Disfluencies include repetitions, revisions and filled pauses. Manual annotation achieved a higher inter-annotator agreement when based on the four overlap types: back-channel, turn request, anticipated turn taking and complementary. Discourse markers are also considered in this study. The disfluency rate in overlaps is almost double that in non-overlapping speech. Repetitions are the disfluency type most often involved, especially in intrusive overlaps (turn requests and complementary). The study highlights interesting differences between active (incoming) and passive (floor-holding) overlap speakers, as well as between journalists and interviewees.

Similar resources

Annotation and analysis of overlapping speech in political interviews

Seeking a better understanding of spontaneous speech-related phenomena and improvements in automatic speech recognition (ASR), we present here a study on the relationship between the occurrence of overlapping speech segments and disfluencies (filled pauses, repetitions, revisions) in political interviews. First we present our data and our overlap annotation scheme. We detail our choice of overl...

A quantitative study of disfluencies in French broadcast interviews

The reported study aims at increasing our understanding of spontaneous speech-related phenomena from sibling corpora of speech and orthographic transcriptions at various levels of elaboration. It makes use of 9 hours of French broadcast interview archives, involving 10 journalists and 10 personalities from political or civil society. First we considered press-oriented transcripts, where most of...

Detection of disfluencies in speech signal

During public presentations or interviews, speakers commonly and unconsciously overuse interjections or filled pauses, which interfere with speech fluency and negatively affect listeners' impressions and speech perception. Types of disfluencies and methods of detection are reviewed. The authors carried out a survey whose results indicated the elements most adverse for the audience. The article presents an ap...

A disfluency study for cleaning spontaneous speech automatic transcripts and improving speech language models

The aim of this study is to elaborate a disfluent speech model by comparing different types of audio transcripts. The study makes use of 10 hours of French radio interview archives, involving journalists and personalities from political or civil society. A first type of transcripts is press-oriented where most disfluencies are discarded. For 10% of the corpus, we produced exact audio transcript...

Synthesising Uncertainty: The Interplay of Vocal Effort and Hesitation Disfluencies

As synthetic voices become more flexible, and conversational systems gain more potential to adapt to the environmental and social situation, it becomes necessary to examine how different modifications to the synthetic speech interact with each other and how their specific combinations influence perception. This work investigates how the vocal effort of the synthetic speech together with adde...

Publication date: 2007